Outlier Detection in Random Subspaces over Data Streams: An Approach for Insider Threat Detection
نویسندگان
چکیده
Insider threat detection is an emergent concern for industries and governments due to the growing number of attacks in recent years. Several Machine Learning (ML) approaches have been developed to detect insider threats, however, they still suffer from a high number of false alarms. None of those approaches addressed the insider threat problem from the perspective of stream mining data where a concept drift or an outlier is an indication of an insider threat. An outlier refers to anomalous behaviour that deviates from the normal baseline of community’s behaviour and is the focus of this paper. To address the shortcoming of existing approaches and realise a novel solution to the problem, we present RandSubOut (Random Subspace Outliers) approach for insider threat detection over real-time data streaming. RandSubOut allows the detection of insider threats represented as localised outliers in random feature subspaces, which would not be detected over the whole feature space, due to dimensionality. We evaluated the presented approach as an ensemble of established distance-based outlier detection methods, namely, Micro-cluster-based Continuous Outlier Detection (MCOD) and Anytime OUTlier detection (AnyOut), according to evaluation measures including True Positive (TP) and False Positive (FP).
منابع مشابه
Evolving Insider Threat Detection Stream Mining Perspective
Evidence of malicious insider activity is often buried within large data streams, such as system logs accumulated over months or years. Ensemble-based stream mining leverages multiple classification models to achieve highly accurate anomaly detection in such streams, even when the stream is unbounded, evolving, and unlabeled. This makes the approach effective for identifying insider threats who...
متن کاملDeep Learning for Unsupervised Insider Threat Detection in Structured Cybersecurity Data Streams
Analysis of an organization’s computer network activity is a key component of early detection and mitigation of insider threat, a growing concern for many organizations. Raw system logs are a prototypical example of streaming data that can quickly scale beyond the cognitive power of a human analyst. As a prospective filter for the human analyst, we present an online unsupervised deep learning a...
متن کاملIdentification of outliers types in multivariate time series using genetic algorithm
Multivariate time series data, often, modeled using vector autoregressive moving average (VARMA) model. But presence of outliers can violates the stationary assumption and may lead to wrong modeling, biased estimation of parameters and inaccurate prediction. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...
متن کاملOutlier Detection in Wireless Sensor Networks Using Distributed Principal Component Analysis
Detecting anomalies is an important challenge for intrusion detection and fault diagnosis in wireless sensor networks (WSNs). To address the problem of outlier detection in wireless sensor networks, in this paper we present a PCA-based centralized approach and a DPCA-based distributed energy-efficient approach for detecting outliers in sensed data in a WSN. The outliers in sensed data can be ca...
متن کاملA Novel Subspace Outlier Detection Approach in High Dimensional Data Sets
Many real applications are required to detect outliers in high dimensional data sets. The major difficulty of mining outliers lies on the fact that outliers are often embedded in subspaces. No efficient methods are available in general for subspace-based outlier detection. Most existing subspacebased outlier detection methods identify outliers by searching for abnormal sparse density units in s...
متن کامل